{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Save LightGBM model in CatBoost format to use fast CatBoost appliers\n",
    "\n",
    "To save LightGBM in CatBoost format you need to convert LightGBM model to ONNX, and then to convert the model from ONNX to CatBoost.\n",
    "+ Save LightGBM model in the ONNX format\n",
    "+ Load the ONNX model into CatBoost using the load_model() method\n",
    "+ Apply your model in CatBoost using the predict() method or save it as file and use with other appliers\n",
    "\n",
    "Note, that this tutorial will only work for Python 3.* since onnxmltools \n",
    "\n",
    "Let us follow this scenario step-by-step for a LightGBM model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Download the MSRank dataset and import the necessary packages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from catboost import datasets, CatBoostRegressor\n",
    "\n",
    "from lightgbm import LGBMRegressor\n",
    "\n",
    "import onnxmltools\n",
    "from onnxconverter_common import *\n",
    "\n",
    "\n",
    "train_df, _ = datasets.msrank_10k()\n",
    "X, Y = train_df[train_df.columns[1:]], train_df[train_df.columns[0]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Build a model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1.30604501 1.60390655 0.35207384 ... 1.18672199 0.55631924 0.54655847]\n"
     ]
    }
   ],
   "source": [
    "model = LGBMRegressor()\n",
    "model.fit(X, Y)\n",
    "predict = model.predict(X)\n",
    "print(predict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Save the model in the ONNX format:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The maximum opset needed by this model is only 1.\n",
      "The maximum opset needed by this model is only 1.\n"
     ]
    }
   ],
   "source": [
    "features_count = len(X.columns)\n",
    "onnx_model = onnxmltools.convert_lightgbm(model, name='LightGBM', initial_types=[['input', FloatTensorType([0, features_count])]])\n",
    "onnxmltools.utils.save_model(onnx_model, 'model.onnx')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Load the ONNX model into CatBoost and print predictions to make sure they are correct."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1.30604502 1.60390654 0.35207381 ... 1.18672202 0.55631925 0.54655849]\n"
     ]
    }
   ],
   "source": [
    "catboost_model = CatBoostRegressor()\n",
    "catboost_model.load_model('model.onnx', format='onnx')\n",
    "catboost_predict = catboost_model.predict(X)\n",
    "print(catboost_predict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Save the model in CatBoost format for later use in C++, C#, Java or Python code with fast CatBoost applier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "catboost_model.save_model('model.bin')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "mypy37",
   "language": "python",
   "name": "mypy37"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
